We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusion that local structural learning with H2PC in the form of local neighborhood induction is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available.
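To make the score-based search phase concrete, the following is a minimal illustrative sketch (in Python, not the authors' R implementation) of greedy hill-climbing over network structures. It considers only edge additions and uses BIC as a stand-in for the Bayesian score; in H2PC the search would additionally be constrained to the skeleton learned by the constraint-based subroutines, and would also consider edge deletions and reversals. All function names here are hypothetical.

```python
# Sketch of score-based greedy hill-climbing for structure learning:
# repeatedly apply the single edge addition that most improves a
# decomposable score (BIC here), enforcing acyclicity at each step.
# Additions only, for brevity; not the authors' H2PC implementation.
import itertools
import math
import random
from collections import Counter

def bic_local(data, child, parents):
    """BIC contribution of one node given its parent set (discrete data)."""
    n = len(data)
    states = {row[child] for row in data}
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    pconf = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / pconf[cfg]) for (cfg, _), c in joint.items())
    n_params = (len(states) - 1) * len(pconf)
    return loglik - 0.5 * math.log(n) * n_params

def is_ancestor(parents, a, x):
    """True if a path a -> ... -> x exists (so adding x -> a would cycle)."""
    stack, seen = [x], set()
    while stack:
        for p in parents[stack.pop()]:
            if p == a:
                return True
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return False

def hill_climb(data, variables):
    parents = {v: set() for v in variables}
    score = {v: bic_local(data, v, sorted(parents[v])) for v in variables}
    while True:
        best = None
        for x, y in itertools.permutations(variables, 2):
            if x in parents[y] or is_ancestor(parents, y, x):
                continue  # edge already present, or x -> y would create a cycle
            gain = bic_local(data, y, sorted(parents[y] | {x})) - score[y]
            if gain > 1e-9 and (best is None or gain > best[0]):
                best = (gain, x, y)
        if best is None:
            return parents
        gain, x, y = best
        parents[y].add(x)
        score[y] += gain  # decomposability: only y's local score changes

# Toy usage: B is a deterministic copy of A, so the search links A and B.
random.seed(0)
rows = []
for _ in range(200):
    a = random.randint(0, 1)
    rows.append({"A": a, "B": a})
learned = hill_climb(rows, ["A", "B"])
```

Because the score is decomposable, each candidate edge change only requires re-evaluating the local score of the child node, which is what makes greedy search over structures tractable.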